ABSTRACT OF THE INVENTION

The present invention relates to a human PAB II
gene containing a transcribed polymorphic GCG repeat,
which comprises a sequence as set forth in Fig. 4,
which includes introns and flanking genomic sequence.
The allelic variants of the GCG repeat of the human
PAB II gene are associated with a disease related to
protein accumulation in the nucleus, such as
polyalanine accumulation, or with a disease related to
swallowing difficulties, such as oculopharyngeal
muscular dystrophy. The present invention also relates
to a method for the diagnosis of a disease with protein
accumulation in the nucleus in a patient, which
comprises the steps of:

a) obtaining a nucleic acid sample of said patient; and

b) determining allelic variants of the GCG repeat of
the human PAB II gene, wherein long allelic variants
are indicative of a disease related to protein
accumulation in the nucleus.
JUN 27 2003

TECH CENTER 1600/2900



SHORT GCG EXPANSIONS IN THE PAB II GENE FOR
OCULOPHARYNGEAL MUSCULAR DYSTROPHY AND DIAGNOSTIC THEREOF



BACKGROUND OF THE INVENTION 

(a) Field of the Invention

The invention relates to the PAB II gene and its
uses for the diagnosis, prognosis and treatment of a
disease related to protein accumulation in the
nucleus, such as oculopharyngeal muscular dystrophy.

(b) Description of Prior Art 

Autosomal dominant oculopharyngeal muscular
dystrophy (OPMD) is an adult-onset disease with a
worldwide distribution. It usually presents in the sixth
decade with progressive swallowing difficulties
(dysphagia), eyelid drooping (ptosis) and proximal
limb weakness. Unique nuclear filament inclusions in
skeletal muscle fibers are its pathological hallmark
(Tome, F.M.S. & Fardeau, Acta Neuropath. 49, 85-87
(1980)). We isolated the poly(A) binding protein II
(PAB II) gene from a 217 kb candidate interval in
chromosome 14q11. A (GCG)6 repeat encoding a
polyalanine tract located at the N-terminus of the
protein was expanded to (GCG)8-13 in the 144 OPMD
families screened. More severe phenotypes were observed
in compound heterozygotes for the (GCG)9 mutation and a
(GCG)7 allele found in 2% of the population, whereas
homozygosity for the (GCG)7 allele leads to autosomal
recessive OPMD. Thus the (GCG)7 allele is an example of
a polymorphism which can act as either a modifier of a
dominant phenotype or as a recessive mutation.
Pathological expansions of the polyalanine tract may
cause mutated PAB II oligomers to accumulate as
filament inclusions in nuclei.

It would be highly desirable to be provided
with a tool for the diagnosis, prognosis and treatment
of a disease related to polyalanine accumulation in the
nucleus, such as oculopharyngeal muscular dystrophy.

SUMMARY OF THE INVENTION

One aim of the present invention is to provide
a tool for the diagnosis, prognosis and treatment of a
disease related to polyalanine accumulation in the
nucleus, such as oculopharyngeal muscular dystrophy.

In accordance with the present invention there
is provided a human PAB II gene containing a transcribed
polymorphic GCG repeat, which comprises a sequence as
set forth in Fig. 4, which includes introns and
flanking genomic sequence.

The allelic variants of the GCG repeat of the human
PAB II gene are associated with a disease related to
protein accumulation in the nucleus, such as polyalanine
accumulation, or with a disease related to swallowing
difficulties, such as oculopharyngeal muscular
dystrophy.

In accordance with the present invention there
is also provided a method for the diagnosis of a
disease with protein accumulation in the nucleus in a
patient, which comprises the steps of:

a) obtaining a nucleic acid sample of said
patient; and

b) determining allelic variants of the GCG repeat
of the human PAB II gene, wherein long allelic
variants are indicative of a disease related
to protein accumulation in the nucleus, such
as polyalanine accumulation and
oculopharyngeal muscular dystrophy.

The long allelic variants are from about 245
to about 263 bp in length.
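
The 245 to 263 bp range is consistent with the 242 bp PCR fragment reported in the detailed description for the normal (GCG)6 allele, with each additional GCG codon adding 3 bp. The following is a minimal sketch of that arithmetic, assuming the amplicon is always produced with the same primer pair; the function names are illustrative and are not part of the disclosure.

```python
NORMAL_REPEATS = 6         # (GCG)6, the normal allele
NORMAL_FRAGMENT_BP = 242   # fragment size reported for the normal allele
BP_PER_REPEAT = 3          # one GCG codon

def fragment_length(repeats: int) -> int:
    """Expected PCR fragment length (bp) for a (GCG)n allele."""
    return NORMAL_FRAGMENT_BP + BP_PER_REPEAT * (repeats - NORMAL_REPEATS)

def repeats_from_length(length_bp: int) -> int:
    """Infer the repeat number n from a measured fragment length (bp)."""
    extra, remainder = divmod(length_bp - NORMAL_FRAGMENT_BP, BP_PER_REPEAT)
    if remainder:
        raise ValueError("length is not a whole number of GCG codons from 242 bp")
    return NORMAL_REPEATS + extra

def is_long_allele(length_bp: int) -> bool:
    """True when the fragment falls in the stated 245 to 263 bp range."""
    return 245 <= length_bp <= 263

for n in range(6, 14):
    print(f"(GCG){n}: {fragment_length(n)} bp, long allele: "
          f"{is_long_allele(fragment_length(n))}")
```

Under these assumptions the lower bound of 245 bp corresponds to (GCG)7 and the upper bound of 263 bp to (GCG)13.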

In accordance with the present invention there
is also provided a non-human mammal model for the
human PAB II gene, whose germ cells and
somatic cells are modified to express at least one
allelic variant of the PAB II gene, wherein said
allelic variant of the PAB II gene is introduced into
the mammal, or an ancestor of the mammal, at an
embryonic stage.

In accordance with the present invention there 
is also provided a method for the screening of 
therapeutic agents for the prevention and/or treatment 
of oculopharyngeal muscular dystrophy, which comprises 
the steps of: 

a) administering said therapeutic agents to the 
non-human mammal of the present invention or 
oculopharyngeal muscular dystrophy patients; 
and 

b) evaluating the prevention and/or treatment of
development of oculopharyngeal muscular
dystrophy in said mammal or said patients.

In accordance with the present invention there
is also provided a method to identify genes that are
part of or interact with a biochemical pathway affected
by the PAB II gene, which comprises the steps of:

a) designing probes and/or primers using the hGTl 
gene of the PAB II gene and screening 
oculopharyngeal muscular dystrophy patients 
samples with said probes and/or primers; and 

b) evaluating the identified gene role in 
oculopharyngeal muscular dystrophy patients. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figs. 1A-B illustrate the positional cloning of
the PAB II gene;

Figs. 2A-G illustrate the OPMD (GCG)n expansion
sizes and sequence of mutations;

Fig. 3 illustrates the age distribution of
swallowing time (st) for French Canadian OPMD carriers
of the (GCG)9 mutation; and



CA 02218199 1997-12-09 



Fig. 4 illustrates the nucleotide sequence of 
human poly(A) binding protein II (hPAB II). 

DETAILED DESCRIPTION OF THE INVENTION 

In order to identify the gene mutated in OPMD,
we constructed a 350 kb cosmid contig between flanking
markers D14S990 and D14S1457 (Fig. 1A). Fig. 1A shows
the positions of the PAB II selected cDNA clones in
relation to the EcoRI restriction map and the
Genealogy-based Estimate of Historical Meiosis
(GEHM)-derived candidate interval (Rommens, J.M. et
al., in Proceedings of the third international workshop
on the identification of transcribed sequences (eds.
Hochgeschwender, U. & Gardiner, K.) 65-79 (Plenum, New
York, 1994)).

The human poly(A) binding protein II gene (PAB
II) is encoded by the nucleotide sequence as set forth
in Fig. 4.

Twenty-five cDNAs were isolated by cDNA
selection from the candidate interval (Rommens, J.M. et
al., in Proceedings of the third international workshop
on the identification of transcribed sequences (eds.
Hochgeschwender, U. & Gardiner, K.) 65-79 (Plenum, New
York, 1994)). Three of these hybridized to a common 20
kb EcoRI restriction fragment and showed high sequence
homology to the bovine poly(A) binding protein II
gene (bPAB II) (Fig. 1A). The PAB II gene appeared to be
a good candidate for OPMD because it mapped to the
genetically defined 0.26 cM candidate interval in 14q11
(Fig. 1A), its mRNA showed a high level of expression
in skeletal muscle, and the PAB II protein is
exclusively localized to the nucleus (Krause, S. et
al., Exp. Cell Res. 214, 75-82 (1994)) where it acts as
a factor in mRNA polyadenylation (Wahle, E., Cell 66,
759-768 (1991); Wahle, E. et al., J. Biol. Chem. 268,
2937-2945 (1993); Bienroth, S. et al., EMBO J. 12,
585-594 (1993)).




We subcloned an 8 kb HindIII genomic fragment
containing the PAB II gene, and sequenced 6002 bp
(GenBank: AF026029) (Nemeth, A. et al., Nucleic Acids
Res. 23, 4034-4041 (1995)) (Fig. 1B). Fig. 1B shows the
genomic structure of the PAB II gene and the position
of the OPMD (GCG)n expansions. Exons are numbered.
Introns 1 and 6 are variably present in 60% of cDNA
clones. ORF, open reading frame; cen, centromere; tel,
telomere.

The coding sequence was based on the previously
published bovine sequence (GenBank: X89969) and the
sequence of 31 human cDNAs and ESTs. The gene is
composed of 7 exons and is transcribed in the cen-qter
orientation (Fig. 1B). Multiple splice variants are
found in ESTs and on Northern blots (Nemeth, A. et al.,
Nucleic Acids Res. 23, 4034-4041 (1995)). In
particular, introns 1 and 6 are present in more than
60% of clones (Fig. 1B; Nemeth, A. et al., Nucleic
Acids Res. 23, 4034-4041 (1995)). The coding and
protein sequences are highly conserved between human,
bovine and mouse (GenBank: U93050). 93% of the PAB II
sequence was readily amenable to RT-PCR- or genomic-
SSCP screening. No mutations were uncovered using either
technique. However, a 400 bp region of exon 1
containing the start codon could not be readily
amplified. This region is 80% GC rich. It includes a
(GCG)6 repeat which codes for the first six alanines of
a homopolymeric stretch of 10 (Fig. 2G). Fig. 2G shows
the nucleotide sequence of the mutated region of PAB
II, the amino acid sequence of the N-terminal
polyalanine stretch and the position of the OPMD
alanine insertions.
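
The relationship between the (GCG)6 repeat and the stretch of 10 alanines can be checked directly on the sequence. The sketch below is illustrative only: it uses the 38 nucleotides that begin at the start codon as they read in the Fig. 4 sequence, and the two helper functions are hypothetical names introduced for this example. Alanine is encoded by any GCN codon, which is why the tract continues past the last GCG.

```python
import re

# Nucleotides starting at the ATG start codon, as read from the Fig. 4 sequence.
EXON1_START = "atggcggcggcggcggcggcggcagcagcagcgggggc"

def gcg_repeat_count(seq: str) -> int:
    """Number of consecutive GCG codons immediately after the start codon."""
    match = re.match(r"atg((?:gcg)+)", seq.lower())
    return len(match.group(1)) // 3 if match else 0

def alanine_tract_length(seq: str) -> int:
    """Number of consecutive alanine codons (GCN) after the start codon."""
    match = re.match(r"atg((?:gc[acgt])+)", seq.lower())
    return len(match.group(1)) // 3 if match else 0

print(gcg_repeat_count(EXON1_START), alanine_tract_length(EXON1_START))

# Simulated (GCG)9 allele: three extra GCG codons inserted after the ATG.
expanded = "atg" + "gcg" * 3 + EXON1_START[3:]
print(gcg_repeat_count(expanded), alanine_tract_length(expanded))
```

The simulated allele is an assumption made for illustration; it is not a sequence reported in the text.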

Special conditions were designed to amplify by
PCR a 242 bp genomic fragment including this GCG
repeat. The (GCG)6 allele was found in 98% of French
Canadian non-OPMD control chromosomes, whereas 2% of
chromosomes carried a (GCG)7 polymorphism (n=86)
(Brais, B. et al., Hum. Mol. Genet. 4, 429-434 (1995)).
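
As a rough illustration of what a 2% allele frequency implies, the sketch below computes expected genotype proportions. It assumes Hardy-Weinberg equilibrium and only two alleles among controls, neither of which is stated in the text, so the output is an order-of-magnitude guide rather than a measured result. The function name is hypothetical.

```python
Q_GCG7 = 0.02          # reported (GCG)7 frequency in control chromosomes
N_CHROMOSOMES = 86     # number of control chromosomes screened

def hardy_weinberg(q: float) -> dict:
    """Expected genotype proportions for a two-allele locus (assumption)."""
    p = 1.0 - q
    return {"6/6": p * p, "6/7": 2 * p * q, "7/7": q * q}

expected = hardy_weinberg(Q_GCG7)
for genotype, proportion in expected.items():
    print(f"(GCG){genotype}: {proportion:.4f}")

# 2% of 86 chromosomes corresponds to about two (GCG)7 chromosomes observed.
print(round(Q_GCG7 * N_CHROMOSOMES))
```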

Screening OPMD cases belonging to 144 families
showed in all cases a PCR product larger by 6 to 21 bp
than that found in controls (Fig. 2A). Fig. 2A shows
the (GCG)6 normal allele (N) and the six different
(GCG)n expansions observed in the 144 families.

Sequencing of these fragments revealed that the
increased sizes were due to expansions of the GCG
repeat (Fig. 2G). Fig. 2F shows the sequence of the
(GCG)9 French Canadian expansion in a heterozygous
parent and his homozygous child: partial sequence of
exon 1 in a normal (GCG)6 control (N), a heterozygote
(ht.) and a homozygote (hm.) for the (GCG)9-repeat
mutation. The number of families sharing the different
(GCG)n-repeat expansions is shown in Table 1.



Table 1

Number of families sharing the different dominant
(GCG)n OPMD mutations

Mutations   Polyalanine*   Families
(GCG)8      12             4
(GCG)9      13             99
(GCG)10     14             19
(GCG)11     15             16
(GCG)12     16             5
(GCG)13     17             1
Total                      144

* 10 alanine residues in normal PAB II.
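
The figures in Table 1 can be cross-checked with a few lines of code. This is a minimal sketch with the table transcribed by hand into a dictionary; the variable and function names are illustrative. It relies on the stated baseline of 10 alanines for the normal (GCG)6 allele, so each added GCG codon adds one alanine.

```python
# Repeat number n of each (GCG)n mutation mapped to the number of families.
TABLE_1 = {8: 4, 9: 99, 10: 19, 11: 16, 12: 5, 13: 1}

NORMAL_REPEATS = 6
NORMAL_ALANINES = 10   # polyalanine stretch in normal PAB II

def polyalanine_length(repeats: int) -> int:
    """Polyalanine stretch length for a (GCG)n allele."""
    return NORMAL_ALANINES + (repeats - NORMAL_REPEATS)

total_families = sum(TABLE_1.values())
most_frequent = max(TABLE_1, key=TABLE_1.get)

for n, families in TABLE_1.items():
    share = 100 * families / total_families
    print(f"(GCG){n}: {polyalanine_length(n)} alanines, "
          f"{families} families ({share:.1f}%)")
print("total families:", total_families)
```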

The (GCG)9 expansion shared by 70 French
Canadian families is the most frequent mutation we
observed (Table 1). The (GCG)9 expansion is quite
stable, with a single doubling observed in family F151
in an estimated 598 French Canadian meioses (Fig. 2C).
Fig. 2C demonstrates the doubling of the French
Canadian (GCG)9 expansion in family F151.
This contrasts with the unstable nature of
previously described disease-causing triplet-repeats
(Rosenberg, R.N., New Eng. J. Med. 335, 1222-1224
(1996)).

Genotyping of all the participants in our
clinical study of French Canadian OPMD provided
molecular insights into the clinical variability
observed in this condition. The genotypes for both
copies of the PAB II mutated region were added to an
anonymous version of our clinical database of 176
(GCG)9 mutation carriers (Brais, B. et al., Hum. Mol.
Genet. 4, 429-434 (1995)). Severity of the phenotype
can be assessed by the swallowing time (st) in seconds
taken to drink 80 cc of ice-cold water (Brais, B. et
al., Hum. Mol. Genet. 4, 429-434 (1995); Bouchard,
J.-P. et al., Can. J. Neurol. Sci. 19, 296-297 (1992)).
The late onset and progressive nature of the muscular
dystrophy is clearly illustrated in heterozygous
carriers of the (GCG)9 mutation (bold curve in Fig. 3)
when compared with the average st of control (GCG)6
homozygous participants (n=76, thinner line in Fig. 3).
In Fig. 3, the bold curve represents the average OPMD
st for carriers of only one copy of the (GCG)9 mutation
(n=169), while the thinner line corresponds to the
average st for (GCG)6 homozygous normal controls (n=76).
The black dot corresponds to the st value for
individual VIII. Roman numerals refer to individual
cases shown in Figs. 2B and 2D and discussed in the
text. Fig. 2B shows the genotype of a homozygous (GCG)9
case and her parents. Fig. 2D shows the independent
segregation of the (GCG)7 allele; Case V has a more
severe OPMD phenotype.

Two groups of genotypically distinct OPMD cases
have more severe swallowing difficulties. Individuals
I, II, and III have an early-onset disease and are
homozygous for the (GCG)9 expansion (P < 10^-5)
(Figs. 2B, F). Cases IV, V, VI and VII have more severe
phenotypes and are compound heterozygotes for the
(GCG)9 mutation and the (GCG)7 polymorphism (P < 10^-5).
In Fig. 2D the independent segregation of the two
alleles is shown. Case V, who inherited the French
Canadian (GCG)9 mutation and the (GCG)7 polymorphism,
is more symptomatic than his brother VIII who carries
the (GCG)9 mutation and a normal (GCG)6 allele
(Figs. 2D and 3). The (GCG)7 polymorphism thus appears
to be a modifier of severity of dominant OPMD.
Furthermore, the (GCG)7 allele can act as a recessive
mutation. This was documented in the French patient IX
who inherited two copies of the (GCG)7 polymorphism and
has a late-onset autosomal recessive form of OPMD
(Fig. 2E). Fig. 2E shows that Case IX, who has a
recessive form of OPMD, inherited two copies of the
(GCG)7 polymorphism.
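
The genotype-phenotype relationships described above can be summarized as a small lookup. The sketch below is illustrative: the function name and the category labels are introduced here, and only genotypes actually described in the text are classified. Any other allele combination is reported as not described rather than extrapolated.

```python
EXPANDED = range(8, 14)   # (GCG)8 to (GCG)13, the dominant OPMD expansions

def classify_genotype(allele_a: int, allele_b: int) -> str:
    """Map a pair of (GCG)n repeat numbers to the phenotype described in the text."""
    low, high = sorted((allele_a, allele_b))
    if (low, high) == (6, 6):
        return "normal"
    if (low, high) == (6, 7):
        return "carrier of the (GCG)7 polymorphism"
    if (low, high) == (7, 7):
        return "autosomal recessive OPMD"
    if low == 6 and high in EXPANDED:
        return "dominant OPMD"
    if (low, high) == (7, 9):
        return "dominant OPMD, more severe (compound heterozygote)"
    if (low, high) == (9, 9):
        return "dominant OPMD, more severe (homozygote)"
    return "not described in the text"

for pair in [(6, 6), (6, 7), (7, 7), (9, 6), (9, 7), (9, 9), (10, 12)]:
    print(pair, "->", classify_genotype(*pair))
```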

This is the first description of short
trinucleotide repeat expansions causing a human
disease. The addition of only two GCG repeats is
sufficient to cause dominant OPMD. OPMD expansions do
not share the cardinal features of "dynamic mutations".
The GCG expansions are not only short, they are also
meiotically quite stable. Furthermore, there is a clear
cut-off between the normal and abnormal alleles, a
single GCG expansion causing a recessive phenotype. The
PAB II (GCG)7 allele is the first example of a
relatively frequent allele which can act as either a
modifier of a dominant phenotype or as a recessive
mutation. This dosage effect is reminiscent of the one
observed in a homozygote for two dominant
synpolydactyly mutations. In this case, the patient had
more severe deformities because she inherited two
duplications causing an expansion in the polyalanine
tract of the HOXD13 protein (Akarsu, A.N. et al., Hum.
Mol. Genet. 5, 945-952 (1996)). A duplication causing a
similar polyalanine expansion in the α subunit 1 gene
of the core-binding transcription factor (CBFA1) has
also been found to cause dominant cleidocranial
dysplasia (Mundlos, S. et al., Cell 89, 773-779
(1997)). The mutations in these two rare diseases are
not triplet-repeats. They are duplications of "cryptic
repeats" composed of mixed synonymous codons and are
thought to result from unequal crossing over (Warren,
S.T., Science 275, 408-409 (1997)). In the case of
OPMD, slippage during replication causing a reiteration
of the GCG codon is a more likely mechanism (Wells,
D.R., J. Biol. Chem. 271, 2875-2878 (1996)).

Different observations converge to suggest that
a gain of function of PAB II may cause the accumulation
of nuclear filaments observed in OPMD (Tome, F.M.S. &
Fardeau, Acta Neuropath. 49, 85-87 (1980)). PAB II is
found mostly in dimeric and oligomeric form (Nemeth, A.
et al., Nucleic Acids Res. 23, 4034-4041 (1995)). It is
possible that the polyalanine tract plays a role in
polymerization. Polyalanine stretches have been found
in many other nuclear proteins such as the HOX
proteins, but their function is still unknown (Davies,
S.W. et al., Cell 90, 537-548 (1997)). Alanine is a
highly hydrophobic amino acid present in the cores of
proteins. In dragline spider silk, polyalanine
stretches are thought to form β-sheet structures
important in ensuring the fibers' strength (Simmons,
A.H. et al., Science 271, 84-87 (1996)). Polyalanine
oligomers have also been shown to be extremely
resistant to chemical denaturation and enzymatic
degradation (Forood, B. et al., Bioch. and Biophy. Res.
Com. 211, 7-13 (1995)). One can speculate that PAB II
oligomers comprised of a sufficient number of mutated
molecules might accumulate in the nuclei by forming
undegradable polyalanine-rich macromolecules. The rate
of the accumulation would then depend on the ratio of
mutated to non-mutated protein. The more severe
phenotypes observed in homozygotes for the (GCG)9
mutation and compound heterozygotes for the (GCG)9
mutation and (GCG)7 allele may correspond to the fact
that in these cases PAB II oligomers are composed only
of mutated proteins. The ensuing faster filament
accumulation could cause accelerated cell death. The
recent description of nuclear filament inclusions in
Huntington's disease raises the possibility that
"nuclear toxicity" caused by the accumulation of
mutated homopolymeric domains is involved in the
molecular pathophysiology of other triplet-repeat
diseases (Davies, S.W. et al., Cell 90, 537-548 (1997);
Scherzinger, E. et al., Cell 90, 549-558 (1997);
DiFiglia, M. et al., Science 277, 1990-1993 (1997)).
Future immunocytochemical and expression studies will
be able to test this pathophysiological hypothesis and
provide some insight into why certain muscle groups are
more affected while all tissues express PAB II.

Methods

Contig and cDNA selection 

The cosmid contig was constructed by standard
cosmid walking techniques using a gridded chromosome
14-specific cosmid library (Evans, G.A. et al., Gene
79, 9-20 (1989)). The cDNA clones were isolated by cDNA
selection as previously described (Rommens, J.M. et
al., in Proceedings of the third international workshop
on the identification of transcribed sequences (eds.
Hochgeschwender, U. & Gardiner, K.) 65-79 (Plenum, New
York, 1994)).

Cloning of the PAB II gene. Three cDNA clones
corresponding to PAB II were sequenced (Sequenase,
USB). Clones were verified to map to cosmids by
Southern hybridization. The 8 kb HindIII restriction
fragment was subcloned from cosmid 166G8 into
pBluescriptII (SK) (Stratagene). The clone was
sequenced using primers derived from the bPAB II gene
and human EST sequences. Sequencing of the PAB II
introns was done by primer walking.

PAB II mutation screening and sequencing. All
cases were diagnosed as having OPMD on clinical grounds
(Brais, B. et al., Hum. Mol. Genet. 4, 429-434 (1995)).
RT-PCR- and genomic SSCP analyses were done using
standard protocols (Lafrenière, R.G. et al., Nat.
Genet. 15, 298-302 (1997)). The primers used to amplify
the PAB II mutated region were: 5'-CGCAGTGCCCCGCCTTAGA-3'
and 5'-ACAAGATGGCGCCGCCGCCCCGGC-3'. PCR reactions
were performed in a total volume of 15 µl containing:
40 ng of genomic DNA; 1.5 µg of BSA; 1 µM of each
primer; 250 µM dCTP and dTTP; 25 µM dATP; 125 µM of
dGTP and 125 µM of 7-deaza-dGTP (Pharmacia); 7.5%
DMSO; 3.75 µCi [35S]dATP; 1.5 unit of Taq DNA polymerase
and 1.5 mM MgCl2 (Perkin Elmer). For non-radioactive
PCR reactions the [35S]dATP was replaced by 225 µM of
dATP. The amplification procedure consisted of an
initial denaturation step at 95 °C for five minutes,
followed by 35 cycles of denaturation at 95 °C for 15 s,
annealing at 70 °C for 30 s, elongation at 74 °C for 30 s
and a final elongation at 74 °C for 7 min. Samples were
loaded on 5% polyacrylamide denaturing gels. Following
electrophoresis, gels were dried and autoradiographs
were obtained. Sizes of the inserts were determined by
comparing to a standard M13 sequence (Sequenase, USB).
Fragments used for sequencing were gel-purified.
Sequencing of the mutated fragment using the Amplicycle
kit (Perkin Elmer) was done with the
5'-CGCAGTGCCCCGCCTTAGAGGTG-3' primer at an elongation
temperature of 68 °C.
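
The amplification program above can be written down as data, which makes it easy to check the programmed hold time and the GC content of the primers. This is a minimal sketch under stated assumptions: ramp times between temperatures are ignored, the variable and function names are illustrative, and the two primer strings are the reverse PCR primer and the sequencing primer exactly as given in the text.

```python
# (temperature in degrees C, hold time in seconds)
INITIAL_DENATURATION = (95, 5 * 60)
CYCLE_STEPS = [(95, 15), (70, 30), (74, 30)]   # denature, anneal, elongate
N_CYCLES = 35
FINAL_ELONGATION = (74, 7 * 60)

def programmed_seconds() -> int:
    """Total programmed hold time, excluding temperature ramps (assumption)."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_ELONGATION[1]

def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)

REVERSE_PRIMER = "ACAAGATGGCGCCGCCGCCCCGGC"
SEQUENCING_PRIMER = "CGCAGTGCCCCGCCTTAGAGGTG"

print(f"programmed time: {programmed_seconds() / 60:.2f} min")
for name, primer in [("reverse", REVERSE_PRIMER), ("sequencing", SEQUENCING_PRIMER)]:
    print(f"{name}: {len(primer)} nt, GC = {gc_fraction(primer):.1%}")
```

The high GC content of both primers is in line with the statement that the amplified region is about 80% GC rich, which is the reason given for the special conditions (7-deaza-dGTP and DMSO).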

Stability of (GCG)-repeat expansions. The
meiotic stability of the (GCG)9-repeat was estimated
based on our large French Canadian OPMD cohort. We
previously established that a single ancestral OPMD
carrier chromosome was introduced in the French
Canadian population by three sisters in 1648. Seventy
of the seventy-one French Canadian OPMD families tested
to date segregate a (GCG)9 expansion. However, in
family F151, the affected brother and sister, despite
sharing the French Canadian ancestral haplotype, carry
a (GCG)12 expansion twice the size of the ancestral
(GCG)9 mutation (Fig. 2C). In our founder effect study,
we estimated that 450 (304-594) historical meioses
shaped the 123 OPMD cases belonging to 42 of the 71
enrolled families. Our screening of our full set of
participants allowed us to identify another 148 (GCG)9
carrier chromosomes. Therefore, we estimate that a
single mutation of the (GCG)9 expansion has occurred in
598 (452-742) meioses.
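
The estimate of 598 (452-742) meioses is the founder-study estimate shifted by the 148 additional carrier chromosomes. The sketch below reproduces that arithmetic and derives a per-meiosis rate; the rate is a simple point estimate from a single observed event, computed here for illustration, and is not a figure stated in the text.

```python
# (point estimate, lower bound, upper bound) from the founder effect study
FOUNDER_MEIOSES = (450, 304, 594)
ADDITIONAL_CARRIER_CHROMOSOMES = 148
OBSERVED_CHANGES = 1   # the (GCG)9 to (GCG)12 change in family F151

total_meioses = tuple(m + ADDITIONAL_CARRIER_CHROMOSOMES for m in FOUNDER_MEIOSES)
point, low, high = total_meioses

# Illustrative point estimate of the change rate per meiosis (assumption).
rate_per_meiosis = OBSERVED_CHANGES / point

print(f"meioses: {point} ({low}-{high})")
print(f"about one change per {point} meioses ({rate_per_meiosis:.4f} per meiosis)")
```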

Genotype-phenotype correlations. 176 carriers
of at least one copy of the (GCG)9 mutation were
examined during the early stage of the linkage study.
All were asked to swallow 80 cc of ice-cold water as
rapidly as possible. Testing was stopped after 60
seconds. The swallowing time (st) was validated as a
sensitive test to identify OPMD cases (Brais, B. et
al., Hum. Mol. Genet. 4, 429-434 (1995); Bouchard,
J.-P. et al., Can. J. Neurol. Sci. 19, 296-297 (1992)).
The st values for 76 (GCG)6 homozygous normal controls
are illustrated in Fig. 3. Analyses of variance were
computed by two-way ANOVA (SYSTAT package). For the
(GCG)9 homozygotes, their mean st value was compared to
the mean value for all (GCG)9 heterozygotes aged 35-40
(P < 10^-5). For the (GCG)9 and (GCG)7 compound
heterozygotes, their mean st value was compared to the
mean value for all (GCG)9 heterozygotes aged 45-65
(P < 10^-5).

While the invention has been described in
connection with specific embodiments thereof, it will be
understood that it is capable of further modifications
and this application is intended to cover any
variations, uses, or adaptations of the invention following,
in general, the principles of the invention and
including such departures from the present disclosure
as come within known or customary practice within the
art to which the invention pertains and as may be
applied to the essential features hereinbefore set
forth, and as follows in the scope of the appended
claims.
The embodiments of the invention in which an exclusive
property or privilege is claimed are defined as
follows:

1. A human PAB II gene containing transcribed
polymorphic GCG repeat, which comprises a sequence as
set forth in Fig. 4, which includes introns and
flanking genomic sequence.

2. The gene of claim 1, wherein allelic variants
of GCG repeat are associated with a disease related
with protein accumulation in nucleus.

3. The gene of claim 2, wherein said protein
accumulation is polyalanine accumulation.

4. The gene of claim 1, wherein allelic variants
of GCG repeat are associated with a disease related
with swallowing difficulties.

5. The gene of claim 1, wherein said disease is
oculopharyngeal muscular dystrophy.

6. A method for the diagnosis of a disease with
protein accumulation in nucleus, which comprises the
steps of:

a) obtaining a nucleic acid sample of said
patient; and

b) determining allelic variants of GCG repeat of
the gene of claim 1, and wherein long allelic
variants are indicative of a disease related
with protein accumulation in nucleus.

7. The method of claim 6, wherein said disease is
oculopharyngeal muscular dystrophy.






8. The method of claim 7, wherein said long
allelic variants have from about 245 to about 263 bp in
length.

9. A non-human mammal model for the PAB II gene of
claim 1, whose germ cells and somatic cells are
modified to express at least one allelic variant of the
PAB II gene and wherein said allelic variant of the PAB
II being introduced into the mammal, or an ancestor of
the mammal, at an embryonic stage.

10. A method for the screening of therapeutic
agents for the prevention and/or treatment of
oculopharyngeal muscular dystrophy, which comprises the
steps of:

a) administering said therapeutic agents to the
non-human mammal of claim 9 or oculopharyngeal
muscular dystrophy patients; and

b) evaluating the prevention and/or treatment of
development of oculopharyngeal muscular
dystrophy in said mammal or said patients.

11. A method to identify genes part of or
interacting with a biochemical pathway affected by PAB
II gene, which comprises the steps of:

a) designing probes and/or primers using the hGTl 
gene of claim 1 and screening oculopharyngeal 
muscular dystrophy patients samples with said 
probes and/or primers; and 

b) evaluating the identified gene role in 
oculopharyngeal muscular dystrophy patients. 
▲ : (GCG)9+(GCG)7
■ : (GCG)9+(GCG)9

Fig. 3
   1 aatgaaggtg gacacccaaa tagccccaat acaaatgcct gttcaatcaa ccaaacatct
  61 aagcagcaca tctatgtggt agcatattgc caggccgtga gactgcgaat ataaatagga
 121 accgcccctc atctgcaggc gctcacaacc tagttagcaa acagtaaaac aattaagcgc
 181 gccgtggaca taggcccact tgtcctggga aatgagggga agctggggtt tgcagtggtt
 241 tgattgaagg gggactacat gttagaggca cagactgggt gcaggtacac ccaaaggaac
 301 gagaagagtg gaaggaaaca acatccacaa agtaaccaca tgctggcgta tcgaaggccg
 361 tgatttacgg ttttgagact ttacctcgcc agcaaagggg ggccagtctg ttagcggtgc
 421 agattggagg ggtgacattg gaagctgtcc aggaaaaaga aaatggaact ggggagcaga
 481 aggcctacgc aagagggcgg gacagacagg acttgtgact agtagctctg gactgaggaa
 541 tcctccctgc tttctggtgc gggagagcta gtggatgatg gtgccaataa cctggatggg
 601 gaaagtaagc tccctcctgg aatgcttcat tcacaacctc cattttcagc aacatcccat
 661 ctactggtgc ttcctggtcg agatacaagt ttcctgaaac tgctgctctg ttttgggcct
 721 cacccggcca acagctcact agctggcaag cagtagtatc aagatggcgg ccccctagga
 781 ctggctagtc atgtgacctc gggtttccca agtttgaagc ccggcagtcc tttcgggggc
 841 aaggttcacc tgtcacgaaa cgagtgtcac cccttcgact ctcgcaagcc aatcggcatc
 901 tgagactggg ccactgcggt gaggcgatcg gaagattggt cctttccagt cgcctagcta
 961 gggccaatca cggagcgtcc catacttcgc gggcccgccc gtaggccggg gagaagcagg
1021 aatatcgtca cagcgtggcg gtattattac ctaaggactc gataggaggt gggacgcgtg
1081 ttgattgaca ggcagatttc cctaccggga tttgagaatt tggcgcagtg cccgccttag
1141 aggtgcgctt atttgattgc caagtaatat tccccaatgg agtactagct catggtgacg
1201 ggcaggcagc ttgagctaat gagtcctccg tggccggcgc agctctccac atgccgggcg
1261 gcgggcccca gtctgagcgg cgatggcggc ggcggcggcg gcggcagcag cagcgggggc
1321 tgcgggcggt cggggctccg ggccggggcg gcggcgccat cttgtgcccg gggccggtgg
1381 ggaggccggg gagggggccc cggggggcgc aggggactac gggaacggcc tggagtctga
1441 ggaactggag cctgaggagc tgctgctgga gcccgagccg gagcccgagc ccgaagagga
1501 gccgccccgg ccccgcgccc ccccgggagc tccgggccct gggcctggtt cgggagcccc
1561 cggcagccaa gaggaggagg aggagccggg actggtcgag ggtgacccgg gggacggcgc
1621 cattgaggac ccggtgagga aggagggcga gcgagcaggc cggcggctgg cgcgtcactg
1681 gaggcccaga gctcgggcga gcggtggcag gcggggggtg gggttgggcg gggaataacg
1741 tggctggggc gggtcgggcc ggggatgggt cagcgatcac tacaaggggc ccgactggct
1801 tgattcgggc gtcacgggtg cctagtgttg ttctagagag ggtagctttt cttttatcac
1861 gaccctcgca tggggcgagg gaaatggccg agcatggctg aggcgcgctc tggccgagag
1921 cagggcacag cccctgcgtt ggttcctctt aagctgtcct ccataccctc cccacttata
1981 ttaggagctg gaagctatca aagctcgagt cagggagatg gaggaagaag ctgagaagct
2041 aaaggagcta cagaacgagg tagagaagca gatgaatatg agtccacctc caggcaatgc
2101 tgagtaactg gcggttgcac gcggagcccg ggttctcggg ttggaagggt tgtggggagg
2161 atggggaatg tggggttaga tactcggcac cctggagctg cttgtctgag ctattatgac
2221 tgtgccgcgg tcatagtccg ttgtgtgttc ctctgacctt fcgtgaggcag aactgatatt
2281 ttggtggtgg tagccttgtg cctccctttg tcctgttata attgtgttgc tctttattct
2341 tagtctacgt ctatctttct ttggtagagg ttgcgtgctc gcatttgacc ttcaaatcta
2401 atagtfctttc ctccaattgg agacgcttta ggattctaag agaaagcaag ctggaagggg
2461 tttccccttt aaattctaga aatgtggagt ctcagcccac ttaattttgc tcactcttaa
2521 aagcatttca accaaagcca ttcattaggg atttgatttg gagggcagga gggattccta
2581 tactgtfctta agtgtgtatt aattctttca atttatcgaa ttatttagtg agtaacctgc
2641 tatgcactag gcactattct cggcttgtgg gtacagcagg gaacagcaca gaccaaaatc
2701 tttgccttca ctgagcttat gggatagtgc tggtggtgga agtgcaacat attggtcaag
2761 tagaaaacaa gtgtgtggtt tttgtaaaaa attatttttt cctgatagct ggcccggtga
2821 tcatgtccat tgaggagaag atggaggctg atgcccgttc catctatgtt ggcaatgtga
2881 cgtactgggg ctctgactgg ggttgggggc aagttcttct tttggggaat tatttaatag
2941 tcctgaaaga acatctccgg gatagatgtg gttttgggtg tggagggagt gtgggaagga
3001 ggttaaaggt aatggaatga tcagtaatca gcaaaggctc tgggtttgga aggaaaagag
3061 attaattcct caaattacca gatttcatgt gctttggtgt atgatggccc agaccaaagg
3121 ctcgggaggg ttcttttgag acaggaattt gcctggtgcc tgtgaaattt ttctcctctc
3181 atcaggtgga ctatggtgca acagcagaag agctggaagc tcactttcat ggctgtggtt
3241 cagtcaaccg tgttaccata ctgtgtgaca aatttagtgg ccatcccaaa ggtaaagtaa
3301 aggggagtaa gttgagataa tttaaattac agtgtacaaa tagataaatt atgttttata
3361 ttgagcagta agttatttgg tgttaacaca ggtgatctgt gtcat'tfcaag atcatggcat
3421 taatgttgat atatcaggag ttgcacctaa atgtcttcag aggccagata acaaaaatga
3481 aggctagatg tgggtgggat tacgaactag aaggggaggg gcagctjtcta cttggcctat
3541 tatggcatat ggaaattcag gccctgtgtg tcttattttt acaaatfttca aagagtagct
3601 ggaaatttta aaatttaaat gatttcgaat gattgaaatt ttccatttag aagaattttg
3661 acaaataaaa aatataactg cattgtagcc caaaacgaag catgccfegca ggttgaattt
3721 gacctgtgag gtatttgtaa cctcagagag atacaatgac aattcttttc aggtttgcgt
3781 atatagagtt ctcagacaaa gagtcagtga ggacttcctt ggccttagat gagtccctat
3841 ttagaggaag gcaaatcaag gtaagcctat gtccattgct gttctagttg tgtataaact
3901 ctccaggttg cctttaaggc tatcatttgt tcatctctga ctcaggtgat cccaaaacga
3961 accaacagac caggcatcag cacaacagac cggggttttc cacgagcccg ctaccgcgcc
4021 cggaccacca actacaacag ctcccgctct cgattctaca gtggtfcttaa cagcaggccc
4081 cggggtcgcg tctacaggtc aggatagatg ggctgctcct ctttcccccg cctcccgtga
4141 gccccgtatg cttcctcctc tctggtctga ggaacctccc tccccccacc cctccccgtg
4201 gtcttcagga actttgtctc ctgcctgtgc aggttgagga aggtagttgc aggccaggcc
4261 agaaggcagc ctcatcatct tttctgcagt agaaattggt gataagggct gcatccctcc
4321 cttggttcaa agaggcttcc acccccagcc ttttttttct tgggagttgg tggcatttga
4381 aggtgtttgc ggacaaaact gggaggaaca gggcctccag gaagttgaaa gcactgcttg
4441 gacatttgtt acttttttcg gagttaggga gggattgaag actgaacctc ccttggaaga
4501 ataccagagg ctagctagtt gatcctccca acagccttgt gggaggattt tgagatactt
4561 attctttatt tgagccagtc ttgcaaggtt aacttctcac tgggcctagt gtggtnccca
4621 ggtttttgcc ttgcttcact tctgtctcta catttaaata gacgggttag gcatataaac
4681 cttggctttt cataagctct acctgcctat ccccaggagt tagggaggat ctatttgtga
4741 aggccctagg gfcttaaaaac tgtggaggac tgaaaaactg gataaaaagg gggtcctttt
4801 ccttgcccct gtctctcact cagatgcgct tctttttcgc cactgtttgg caaagttttc
4861 tgttaagccc ccctccccct gccccagttc tcccaggtgc gttactattt ctgggatcat
4921 ggggtcggtt ttzaggacact tgaacacttc ttttcccccc ttcccttcac agtaactggg
4981 gcaggggcct acggggaggg gcttgtactg aactatctag tgatcacgtt aacacctaac
5041 tctccttctt tcttccaggg gccgggctag agcgacatca tggtattccc cttactaaaa
5101 aaagtgtgta ttaggaggag agagaggaaa aaaagaggaa agaaggaaaa aaaaaagaat
5161 taaaaaaaaa aaaaagaaaa acagaagatg accttgatgg aaaaaaaata ttttttaaaa
5221 aaaagatata ctgtggaagg ggggagaatc ccataactaa ctgctgagga gggacctgct
5281 ttggggagta ggggaaggcc cagggagtgg ggcagggggc tgcttattca ctctggggat
5341 tcgccatgga cacgtctcaa ctgcgcaagc tgcttgccca tgtttccctg cccccttcac
5401 ccccttgggc ctgctcaagg gtaggtgggc gtgggtggta ggagggtttt ttttacccag
5461 ggctctggaa ggacaccaaa ctgttctgct tgttaccttc cctcccgtct tctcctcgcc
5521 tttcacagtc ccctcctgcc tgctcctgtc cagccaggtc taccacccac cccacccctc
5581 tttctccggc tccctgcccc tccagattgc ctggtgatct attttgtttc cttttgtgtt
5641 tctttttctg ttttgagtgt ctttctttgc aggtttctgt agccggaaga tctccgttcc
5701 gctcccagcg gctccagtgt aaattcccct tccccctggg gaaatgcact accttgtttt
5761 ggggggttta ggggtgtttt tgtttttcag ttgttttgtt tttttgtttt ttttttttcc
5821 tttgcctttt ttccctttta tttggaggga atgggaggaa gtgggaacag ggaggtggga
5881 ggtggatttt gtttattttt ttagctcatt tccaggggtg ggaatttttt tttaatatgt
5941 gtcatgaata aagttgtttt tgaaaataaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
6001 aa

Fig. 4